Two-step estimation of latent trait models
We consider two-step estimation of latent variable models, in which just the
measurement model is estimated in the first step and the measurement parameters
are then fixed at their estimated values in the second step where the
structural model is estimated. We show how this approach can be implemented for
latent trait models (item response theory models) where the latent variables
are continuous and their measurement indicators are categorical variables. The
properties of two-step estimators are examined using simulation studies and
applied examples. They perform well, and have attractive practical and
conceptual properties compared to the alternative one-step and three-step
approaches. These results are in line with previous findings for other families
of latent variable models. This provides strong evidence that two-step
estimation is a flexible and useful general method of estimation for different
types of latent variable models. Comment: 39 pages, 2 figures, 17 tables
Relating latent class membership to external variables: an overview
In this article we provide an overview of existing approaches for relating latent class membership to external variables of interest. We extend the work of Nylund-Gibson et al. (Structural Equation Modeling: A Multidisciplinary Journal, 2019, 26, 967), who summarize models with distal outcomes, by providing an overview of the most recommended modeling options for models with covariates and for larger models with multiple latent variables as well. We exemplify the modeling approaches using data from the General Social Survey for a model with a distal outcome where underlying model assumptions are violated, and a model with multiple latent variables. We discuss software availability and provide example syntax for the real data examples in Latent GOLD.
Unraveling the Skillsets of Data Scientists: Text Mining Analysis of Dutch University Master Programs in Data Science and Artificial Intelligence
The growing demand for data scientists in the global labor market and the
Netherlands has led to a rise in data science and artificial intelligence (AI)
master programs offered by universities. However, there is still a lack of
clarity regarding the specific skillsets of data scientists. This study aims to
address this issue by employing Correlated Topic Modeling (CTM) to analyse the
content of 41 master programs offered by seven Dutch universities. We assess
the differences and similarities in the core skills taught by these programs,
determine the subject-specific and general nature of the skills, and provide a
comparison between the different types of universities offering these programs.
Our findings reveal that research, data processing, statistics and ethics are
the predominant skills taught in Dutch data science and AI master programs,
with general universities emphasizing research skills and technical
universities focusing more on IT and electronic skills. This study contributes
to a better understanding of the diverse skillsets of data scientists, which is
essential for employers, universities, and prospective students.
Two-step estimation of models between latent classes and external variables
We consider models which combine latent class measurement models for categorical latent variables with structural regression models for the relationships between the latent classes and observed explanatory and response variables. We propose a two-step method of estimating such models. In its first step the measurement model is estimated alone, and in the second step the parameters of this measurement model are held fixed when the structural model is estimated. Simulation studies and applied examples suggest that the two-step method is an attractive alternative to existing one-step and three-step methods. We derive estimated standard errors for the two-step estimates of the structural model which account for the uncertainty from both steps of the estimation, and show how the method can be implemented in existing software for latent variable modelling.
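The two steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two latent classes, binary items, a single covariate and a logistic structural model, and every function name, starting value and simulated parameter below is hypothetical. Standard errors that account for both steps, which the abstract mentions, are not computed here.

```python
import numpy as np


def simulate(n=2000, seed=0):
    """Simulate hypothetical data: covariate x, four binary items y, two classes."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Assumed structural model: P(class 1 | x) = logistic(-0.5 + 1.0 * x)
    p1 = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))
    c = rng.binomial(1, p1)
    # Assumed measurement model: item response probabilities per class
    pi = np.array([[0.1, 0.2, 0.15, 0.1], [0.9, 0.8, 0.85, 0.9]])
    y = rng.binomial(1, pi[c])
    return x, y


def item_lik(y, pi):
    """Return the n x K matrix of P(y_i | class k) under local independence."""
    return np.exp(y @ np.log(pi).T + (1 - y) @ np.log(1 - pi).T)


def step1(y, n_iter=200):
    """Step 1: EM for the measurement model alone (no covariates)."""
    n_items = y.shape[1]
    pi = np.vstack([np.full(n_items, 0.3), np.full(n_items, 0.7)])
    w = np.array([0.5, 0.5])  # class proportions
    for _ in range(n_iter):
        joint = item_lik(y, pi) * w
        post = joint / joint.sum(axis=1, keepdims=True)   # E-step
        w = post.mean(axis=0)                             # M-step
        pi = (post.T @ y) / post.sum(axis=0)[:, None]
        pi = np.clip(pi, 1e-6, 1 - 1e-6)
    return pi


def step2(x, y, pi_fixed, n_iter=200):
    """Step 2: estimate the structural model with pi held fixed."""
    X = np.column_stack([np.ones_like(x), x])
    lik = item_lik(y, pi_fixed)  # computed once, never updated
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        joint = np.column_stack([1 - p, p]) * lik
        post = joint / joint.sum(axis=1, keepdims=True)   # E-step
        # M-step: one Newton update of the logistic regression
        # with the posterior class-1 probability as response
        grad = X.T @ (post[:, 1] - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


if __name__ == "__main__":
    x, y = simulate()
    pi_hat = step1(y)
    print("step-1 item probabilities:\n", pi_hat.round(2))
    print("step-2 structural coefficients:", step2(x, y, pi_hat).round(2))
```

The point of the design is that `step2` never updates `pi_fixed`, so adding or removing a covariate cannot change the definition of the classes.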
Multilevel latent class analysis with covariates: Analysis of cross-national citizenship norms with a two-stage approach
This paper focuses on the substantive application of multilevel LCA to the
evolution of citizenship norms in a diverse array of democratic countries. To
do so, we present a two-stage approach to fit multilevel latent class models:
in the first stage (measurement model construction), unconditional class
enumeration is done separately on both low- and high-level latent variables,
estimating only a part of the model at a time -- hence keeping the remaining
part fixed -- and then updating the full measurement model; in the second stage
(structural model construction), individual and/or group covariates are
included in the model. By separating the two parts -- first stage and second
stage of model building -- the measurement model is stabilized and is allowed
to be determined only by its indicators. Moreover, this two-step approach
makes the inclusion/exclusion of a covariate a relatively simple task to
handle. Our proposal amends common practice in applied social science research,
where simple (low-level) LCA is done to obtain a classification of low-level
units, and this is then related to (low- and high-level) covariates simply by
including group fixed effects. Our analysis identifies latent classes that
score either consistently high or consistently low on all measured items, along
with two theoretically important classes that place distinctive emphasis on
items related to engaged citizenship and duty-based norms.
A two-step estimator for multilevel latent class analysis with covariates
We propose a two-step estimator for multilevel latent class analysis (LCA)
with covariates. The measurement model for observed items is estimated in its
first step, and in the second step covariates are added in the model, keeping
the measurement model parameters fixed. We discuss model identification, and
derive an Expectation Maximization algorithm for efficient implementation of
the estimator. By means of an extensive simulation study we show that (i) this
approach performs similarly to existing stepwise estimators for multilevel LCA
but with much reduced computing time, and (ii) it yields approximately unbiased
parameter estimates with a negligible loss of efficiency compared to the
one-step estimator. The proposal is illustrated with a cross-national analysis
of predictors of citizenship norms. Comment: Manuscript version accepted for publication in Psychometrika
Modeling predictors of latent classes in regression mixture models
The purpose of this study is to provide guidance on a process for including latent class predictors in regression mixture models. We first examine the performance of current practice for using the 1-step and 3-step approaches where the direct covariate effect on the outcome is omitted. None of the approaches shows adequate estimates of model parameters. Given that Step 1 of the 3-step approach shows adequate results in class enumeration, we suggest using an alternative approach: (a) decide the number of latent classes without predictors of latent classes, and (b) bring the latent class predictors into the model with the inclusion of hypothesized direct covariate effects. Our simulations show that this approach leads to good estimates for all model parameters. The proposed approach is demonstrated by using empirical data to examine the differential effects of family resources on students’ academic achievement outcome. Implications of the study are discussed.